Spark: e2e Verifiability

In this explainer, we walk through each step of the Spark protocol. These are:

  • Deal Ingestion
  • Task Sampling
  • Retrieval Requests
  • Measurement Submission
  • Evaluation of Measurements
  • On-chain Evaluation of Checkers & Rewards

In this doc, we discuss the verifiability of each step.

GitHub issue tracking this work: https://github.com/space-meridian/roadmap/issues/182

Deal Ingestion

This step is verifiable and can be rerun since all the Eligible Deals are stored on the Filecoin Blockchain.

In theory, someone should be able to recreate the exact state of the Spark Eligible Deal database by running the deal ingestor across the entire Filecoin Blockchain.

In practice, this may be tricky because we use chain snapshots to obtain data about recently made deals when running the deal ingestor logic.

Moreover, the Spark Eligible Deal Database is currently not public, so it is hard to compare a rerun against the current state of the database.
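
To illustrate what a rerun could look like, here is a minimal sketch that pulls all market deals from a Lotus JSON-RPC endpoint via Filecoin.StateMarketDeals and applies an eligibility filter. The filter criteria shown (verified deal, not yet expired) are illustrative assumptions, not Spark's exact rules:

```javascript
// Sketch of re-running deal ingestion against public chain state.
// Assumes access to a Lotus JSON-RPC endpoint. The eligibility
// criteria below are illustrative, not Spark's actual logic.
async function fetchMarketDeals(lotusRpcUrl) {
  const res = await fetch(lotusRpcUrl, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({
      jsonrpc: '2.0',
      id: 1,
      method: 'Filecoin.StateMarketDeals',
      params: [null], // null tipset key = current chain head
    }),
  });
  return (await res.json()).result;
}

// Pure filter, so two independent reruns over the same chain
// state produce the same list and can be compared directly.
function eligibleDeals(deals, nowEpoch) {
  return Object.entries(deals)
    .filter(([, d]) =>
      d.Proposal.VerifiedDeal && d.Proposal.EndEpoch > nowEpoch)
    .map(([id, d]) => ({ id, provider: d.Proposal.Provider }));
}
```

Keeping the filter pure (no clock or network access inside it) is what makes a rerun comparable in the first place.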

Potential Work Items

  • Make the Eligible Deal Database public, verifiable, hash-linked, immutable
  • Include verifiable deal expiration logic

Task Sampling

The logic for choosing the Round Retrieval Task List from the Eligible Deal Database is currently not verifiable: tasks are drawn at random each round using Postgres's random() function, so the selection cannot be reproduced.

Given the Round Retrieval Task List for the round, the remainder of this step is verifiable and repeatable.

The round details for each round of Spark are available via the following API:

https://api.filspark.com/rounds/meridian/{smart-contract-address}/{round-index}

Checker nodes use the following URL, which redirects to the contract+round permalink: https://api.filspark.com/rounds/current.

To calculate which tasks a particular Spark checker should have performed in a given round, take the maxTasksPerRound field from the API response above and determine which tasks are closest, and therefore assigned, to that checker, given its Station ID.

Potential Work Items

  • Use drand or some other deterministic algorithm for sampling the Round Retrieval Task List from the Eligible Deal Database
  • Double-check that each field in the round details object, such as maxTasksPerRound, is verifiable and can be derived from public, immutable data.

Retrieval Requests and Measurement Submission

This step is not verifiable or repeatable, and it never will be. In fact, if this step were easily verifiable, the Spark protocol would have been built years ago. The Spark protocol is, in essence, an effort to make this step trustless.

We cannot recreate the exact conditions under which a Spark checker made a retrieval request. Perhaps their internet stopped working momentarily and they recorded an error. Perhaps they got a runtime error from IPNI. Perhaps they cheated and made up the measurement details without running the request.

Many features in the Spark protocol, such as honest-majority consensus, exist precisely because we cannot make this step of the protocol verifiable and repeatable, and never will be able to. The aim of Spark is to create conditions under which actors are incentivised to act honestly.
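
To make the consensus idea concrete, a toy honest-majority rule for a single retrieval task might look like this; it mirrors the concept only, not Spark's actual acceptance rules:

```javascript
// Toy illustration of honest-majority acceptance for one task:
// a result is committed only if a strict majority of the submitted
// measurements agree on it. Illustrative, not Spark's exact logic.
function majorityResult(measurements) {
  const tally = new Map();
  for (const m of measurements) {
    tally.set(m.result, (tally.get(m.result) ?? 0) + 1);
  }
  for (const [result, count] of tally) {
    if (count * 2 > measurements.length) return result;
  }
  return null; // no strict majority: the task result is not accepted
}
```

Under this rule, a lone dishonest checker cannot flip a task's outcome as long as honest checkers outnumber dishonest ones for that task.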

Potential Work Items

  • Keep working on the hard problems in the protocol, such as honest-majority consensus and the checker reputation system.

Evaluation & rewards

Given a set of submitted measurements, this step of the protocol is fully verifiable, but it currently has one gap. If the production Spark evaluate service crashes halfway through a round, it recovers, but it then only processes measurements submitted after the restart; measurements it had ingested before the crash are lost from its working set. This means it evaluates the checkers in that round based on only a subset of the round's measurements.

Anyone aiming to recreate this step for a round in which the Spark evaluate service crashed would be unable to do so: they would run the algorithm over all measurements for the round, not just the subset processed after the crash.

Once we have shipped the graceful recovery logic for the Spark evaluate service, this step will be fully verifiable and replicable.

You can fetch all the measurements for a round from Storacha and run the Spark evaluate service on these measurements to create the evaluations. You can then compare these evaluations with the original on-chain evaluations.

The on-chain evaluations are processed into rewards on chain, so this part is fully verifiable.

Potential Work Items

  • Graceful mid-round recovery for Spark evaluate (GitHub)